Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

modern lstm #2752

Merged
merged 15 commits into from
Feb 3, 2025
Merged

modern lstm #2752

merged 15 commits into from
Feb 3, 2025

Conversation

wangjiawen2013
Copy link
Contributor

Pull Request Template

Checklist

  • Confirmed that run-checks all script has been executed.
  • Made sure the book is up to date with changes in this PR.

Related Issues/PRs

Provide links to relevant issues and dependent PRs.

Changes

Summarize the problem being addressed and your solution.

Testing

Describe how these changes have been tested.

Copy link

codecov bot commented Jan 28, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.62%. Comparing base (9f00320) to head (af75564).
Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2752   +/-   ##
=======================================
  Coverage   83.62%   83.62%           
=======================================
  Files         825      825           
  Lines      108686   108686           
=======================================
  Hits        90893    90893           
  Misses      17793    17793           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@wangjiawen2013
Copy link
Contributor Author

wangjiawen2013 commented Jan 28, 2025

Here is a test result, the results maybe a little different each time because the datasets were randomly generated.

屏幕截图 2025-01-28 161440

This implementation is complementary to Burn's official LSTM, users can choose either one depends on the project's specific needs.

@laggui laggui self-requested a review January 28, 2025 18:08
Copy link
Member

@laggui laggui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great example! 🙂 I have some minor changes in mind.

We should also add it to the list of examples in the README and book.

The example could be restructured to follow the workspace examples (similar to what I did in your last wgan example), but that's not mandatory.

examples/modern-lstm/README.md Outdated Show resolved Hide resolved
examples/modern-lstm/README.md Outdated Show resolved Hide resolved
examples/modern-lstm/src/main.rs Outdated Show resolved Hide resolved
examples/modern-lstm/src/model.rs Outdated Show resolved Hide resolved
examples/modern-lstm/Cargo.toml Outdated Show resolved Hide resolved
@wangjiawen2013
Copy link
Contributor Author

Good! I am not so familiar with Rust and Burn as you. I am glad that you can revise the code. I'll learn from you on the design logic and formatting.

@wangjiawen2013
Copy link
Contributor Author

Good! I am not so familiar with Rust and Burn as you. I am glad that you can revise the code. I'll learn from you on the design logic and formatting.
I am not clear about the working mechanisms of github. I find that you have modified the code in my pull request. Do I need modify the source code and pull a request again, or you will make all the changes you mentioned instead of me ? If I need to make the changes you suggest, I will modify and test it in my fork and pull a request again.

@laggui
Copy link
Member

laggui commented Jan 30, 2025

I am not clear about the working mechanisms of github. I find that you have modified the code in my pull request.

The specific changes for the Cargo.toml were pretty short so I was able to suggest the code changes from the github PR review, and as you can see it automatically committed them to your fork when you applied them.

Do I need modify the source code and pull a request again, or you will make all the changes you mentioned instead of me ? If I need to make the changes you suggest, I will modify and test it in my fork and pull a request again.

The contribution process usually happens in a sequential manner. A user (you) submits code changes, a contributor (me) reviews the code and requests modifications where needed, and the user can provide his thoughts on the requests and make the necessary changes before asking for another round of review. And the process is typically repeated. But everything remains as part of the same pull request 🙂

In your last PR, the only changes required were structural (change how the example needed to be exposed in the workspace) so I made the changes myself since I had time.

I'll let you make the changes, but if you need help let me know!

@wangjiawen2013
Copy link
Contributor Author

wangjiawen2013 commented Jan 30, 2025

It's weired that the model performance decreased a little after these changes. Is the latest burn still unstable or the crates in the Cargo.toml will affect the performance ? I didn't change any parameters and the model. The only things changed are Cargo.toml and the shell commands. I tested the orginal one again, it indeed worked better than this new one.
This is the new one:
new

I think I find the reason. I copied the code from the wgan-generate.rs (https://github.com/tracel-ai/burn/blob/main/examples/wgan/examples/wgan-generate.rs), where the autodiffbackend was used. But in modern-lstm inference, I think the backend without autodiff must be used, otherwise the predictions will be erroneous because of the layernorm and dropout. Now it works as expected when using backend without autodiff for inference.

@laggui
Copy link
Member

laggui commented Jan 31, 2025

I think I find the reason. I copied the code from the wgan-generate.rs (https://github.com/tracel-ai/burn/blob/main/examples/wgan/examples/wgan-generate.rs), where the autodiffbackend was used.

Oh good catch! The structure likely was carried over from another example, but I just fixed it in #2736.

I'll fix the wgan example.

Copy link
Member

@laggui laggui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed some conflicts with the main branch.

Thanks for the great example 🙂 we might add this LSTM implementation to our actual modules in the future

@laggui laggui merged commit e2fa935 into tracel-ai:main Feb 3, 2025
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants